Dynamic index, LZ factorization, and LCE queries in compressed space
Abstract
In this paper, we present the following results: (1) We propose a new dynamic compressed index of O(w) space that supports searching for a pattern P in the current text in O(|P| log w + log w log |P| log N (log* M) + occ log N) time and insertion/deletion of a substring of length y in O((y + log N log* M) log w log N log* M) time, where N is the length of the current text, M is the maximum length of the dynamic text, z is the size of the Lempel-Ziv77 (LZ77) factorization of the current text, and w = O(z log N log* M). (2) We propose a new space-efficient LZ77 factorization algorithm for a given text of length N, which runs in O(N log w + z log w log N (log N)) time with O(w) working space, where w = O(z log N log* N). (3) We propose a data structure of O(w) space which supports longest common extension (LCE) queries on the text in O(log N log N) time. The LCE data structure can also maintain a grammar-compressed representation of a dynamic text efficiently. On top of the above contributions, we show several applications of our data structures which improve on previously known results.
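To make the two central notions of the abstract concrete, the following is a minimal uncompressed sketch in Python, not the paper's data structures. It computes an LCE value by direct character comparison and uses it to build a greedy LZ77 factorization in quadratic time, so that the quantity z can be seen on a small input. The function names `lce` and `lz77_factorize` are hypothetical, and the sketch assumes the LZ77 variant in which a factor may overlap its earlier occurrence; the paper's exact variant may differ.

```python
def lce(text: str, i: int, j: int) -> int:
    """Length of the longest common prefix of text[i:] and text[j:].

    Naive scan: time proportional to the answer, no preprocessing.
    """
    n = len(text)
    length = 0
    while i + length < n and j + length < n and text[i + length] == text[j + length]:
        length += 1
    return length


def lz77_factorize(text: str) -> list[str]:
    """Greedy LZ77 factorization (assumed variant: sources may overlap).

    Each factor is the longest prefix of the remaining text that also
    starts at some earlier position, or a single fresh character.
    Quadratic time; for illustration only.
    """
    factors = []
    pos = 0
    n = len(text)
    while pos < n:
        # Longest match between the suffix at pos and any earlier suffix.
        longest = max((lce(text, src, pos) for src in range(pos)), default=0)
        step = max(longest, 1)  # a fresh character forms a factor of length 1
        factors.append(text[pos:pos + step])
        pos += step
    return factors


if __name__ == "__main__":
    factors = lz77_factorize("abababab")
    print(factors, "z =", len(factors))
```

On the input "abababab" this sketch yields the factors "a", "b", "ababab", so z = 3, which illustrates why space bounds stated in terms of z can be far smaller than N on repetitive text.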
Related articles
Longest Common Extensions with Recompression
Given two positions i and j in a string T of length N, a longest common extension (LCE) query asks for the length of the longest common prefix between the suffixes beginning at i and j. A compressed LCE data structure is a data structure that stores T in a compressed form while supporting fast LCE queries. In this article we show that the recompression technique is a powerful tool for compressed L...
Fully Dynamic Data Structure for LCE Queries in Compressed Space
A Longest Common Extension (LCE) query on a text T of length N asks for the length of the longest common prefix of the suffixes starting at two given positions. We show that the signature encoding G of size w = O(min(z log N log* M, N)) [Mehlhorn et al., Algorithmica 17(2):183–198, 1997] of T, which can be seen as a compressed representation of T, is capable of supporting LCE queries in O(log N ...
Dynamic Index and LZ Factorization in Compressed Space
In this paper, we propose a new dynamic compressed index of O(w) space for a dynamic text T, where w = O(min(z log N log* M, N)) is the size of the signature encoding of T, z is the size of the Lempel-Ziv77 (LZ77) factorization of T, N is the length of T, and M ≥ 3N is an integer that can be handled in constant time under the word RAM model. Our index supports searching for a pattern P in T in O(|...
A Faster Longest Common Extension Algorithm on Compressed Strings and its Applications
In this talk, we introduce our recent data structure for longest common extension (LCE) queries on grammar-compressed strings. Our preprocessing input is a straight-line program (SLP) of size n describing a string w of length N, which is essentially a CFG in Chomsky normal form generating only w. We can preprocess the input SLP in O(n log log n log N log* N) time so that later, given two va...
Small-space encoding LCE data structure with constant-time queries
The longest common extension (LCE) problem is to preprocess a given string w of length n so that the length of the longest common prefix between suffixes of w that start at any two given positions is answered quickly. In this paper, we present a data structure of O(zτ + n/τ) words of space which answers LCE queries in O(1) time and can be built in O(n log σ) time, where 1 ≤ τ ≤ √n is a parame...
Journal title:
- CoRR
Volume abs/1504.06954
Pages -
Publication date 2015